Collaborating Authors

 Kurdistan Province


Evolution of Meta's LLaMA Models and Parameter-Efficient Fine-Tuning of Large Language Models: A Survey

Abdullah, Abdulhady Abas, Zubiaga, Arkaitz, Mirjalili, Seyedali, Gandomi, Amir H., Daneshfar, Fatemeh, Amini, Mohammadsadra, Mohammed, Alan Salam, Veisi, Hadi

arXiv.org Artificial Intelligence

This review surveys the rapid evolution of Meta AI's LLaMA (Large Language Model Meta AI) series, from LLaMA 1 through LLaMA 4, and the specialized parameter-efficient fine-tuning (PEFT) methods developed for these models. We first describe the LLaMA family of foundation models (7B-65B to 288B parameters), their architectures (including native multimodal and Mixture-of-Experts variants), and key performance characteristics. We then describe and discuss the concept of PEFT, which adapts large pre-trained models by updating only a small subset of parameters, and review five PEFT methods that have been applied to LLaMA: LoRA (Low-Rank Adaptation), LLaMA-Adapter V1 and V2, LLaMA-Excitor, and QLoRA (Quantized LoRA). We discuss each method's mechanism, parameter savings, and example application to LLaMA (e.g., instruction tuning, multimodal tasks). We provide a structured discussion and analysis of model and adapter architectures, parameter counts, and benchmark results (including examples where fine-tuned LLaMA models outperform larger baselines). Finally, we examine real-world use cases where LLaMA-based models and PEFT have been successfully applied (e.g., legal and medical domains), and we discuss ongoing challenges and future research directions (such as scaling to even larger contexts and improving robustness). This survey paper provides a one-stop resource for ML researchers and practitioners interested in LLaMA models and efficient fine-tuning strategies.
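
As an illustration of the parameter savings the abstract attributes to PEFT, the following is a minimal sketch of the LoRA idea in plain Python: a frozen weight matrix W is left untouched and only two small matrices A (rank x d_in) and B (d_out x rank) are trained. The function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, not values taken from the survey.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora): trainable parameters for full fine-tuning of one
    d_out x d_in weight matrix versus a rank-`rank` LoRA update."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x).

    W stays frozen; only A and B would be updated during fine-tuning.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical layer size, chosen only to show the order of magnitude.
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(f"full: {full}, LoRA: {lora}, fraction trained: {lora / full:.4%}")
```

With B initialized to zeros, as is common for LoRA, `lora_forward` returns exactly the frozen model's output, so fine-tuning starts from the pre-trained behavior.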


A Comprehensive Part-of-Speech Tagging to Standardize Central-Kurdish Language: A Research Guide for Kurdish Natural Language Processing Tasks

Sabr, Shadan Shukr, Mustafa, Nazira Sabr, Omar, Talar Sabah, Rasool, Salah Hwayyiz, Omer, Nawzad Anwer, Hamad, Darya Sabir, Shams, Hemin Abdulhameed, Kareem, Omer Mahmood, Abdullah, Rozhan Noori, Abdullah, Khabat Atar, Mohammad, Mahabad Azad, Al-Raghefy, Haneen, Asaad, Safar M., Mohammed, Sara Jamal, Ali, Twana Saeed, Shawrow, Fazil, Maghdid, Halgurd S.

arXiv.org Artificial Intelligence

The field of natural language processing (NLP) has dramatically expanded within the last decade. Many everyday applications rely on NLP tasks such as machine translation, speech recognition, text generation and recommendation, Part-of-Speech (POS) tagging, and Named-Entity Recognition (NER). However, low-resourced languages, such as the Central-Kurdish language (CKL), remain largely unexamined due to a shortage of the resources needed to support their development. POS tagging is the basis of other NLP tasks; for example, the POS tag set has been used to standardize languages and to describe the relationships between words within sentences, followed by machine translation and text recommendation. Specifically, for the CKL, most of the utilized or provided POS tagsets are neither standardized nor comprehensive. To this end, this study presents an accurate and comprehensive POS tagset for the CKL to provide better performance on Kurdish NLP tasks. The article also collects most of the POS tags from different studies, as well as from Kurdish linguistic experts, to standardize part-of-speech tags. The proposed POS tagset is designed to annotate a large CKL corpus and support Kurdish NLP tasks. The initial investigations of this study, via comparison with the Universal Dependencies framework for standard languages, show that the proposed POS tagset can streamline or correct sentences more accurately for Kurdish NLP tasks.


On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise

Eslami, M. Reza, Biswas, Dhiman, Takhtardeshir, Soheib, Sharif, Sarah S., Banad, Yaser M.

arXiv.org Artificial Intelligence

This paper presents a memristor-based compute-in-memory hardware accelerator for on-chip training and inference, focusing on its accuracy and efficiency against device variations, conductance errors, and input noise. Utilizing realistic SPICE models of commercially available silver-based metal self-directed channel (M-SDC) memristors, the study incorporates inherent device non-idealities into the circuit simulations. The hardware, consisting of 30 memristors and 4 neurons, utilizes three different M-SDC structures with tungsten, chromium, and carbon media to perform binary image classification tasks. An on-chip training algorithm precisely tunes memristor conductance to achieve target weights. Results show that incorporating moderate noise (<15%) during training enhances robustness to device variations and noisy input data, achieving up to 97% accuracy despite conductance variations and input noises. The network tolerates a 10% conductance error without significant accuracy loss. Notably, omitting the initial memristor reset pulse during training considerably reduces training time and energy consumption. The hardware designed with chromium-based memristors exhibits superior performance, achieving a training time of 2.4 seconds and an energy consumption of 18.9 mJ. This research provides insights for developing robust and energy-efficient memristor-based neural networks for on-chip learning in edge applications.


Language and Speech Technology for Central Kurdish Varieties

Ahmadi, Sina, Jaff, Daban Q., Alam, Md Mahfuz Ibn, Anastasopoulos, Antonios

arXiv.org Artificial Intelligence

Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a dialect continuum and known for its diversity in language varieties. Previous studies addressing language and speech technology for Kurdish handle it in a monolithic way as a macro-language, resulting in disparities for dialects and varieties for which there are few resources and tools available. In this paper, we take a step towards developing resources for language and speech technology for varieties of Central Kurdish, creating a corpus by transcribing movies and TV series as an alternative to fieldwork. Additionally, we report the performance of machine translation, automatic speech recognition, and language identification as downstream tasks evaluated on Central Kurdish varieties. Data and models are publicly available under an open license at https://github.com/sinaahmadi/CORDI.


Towards Cohesion-Fairness Harmony: Contrastive Regularization in Individual Fair Graph Clustering

Ghodsi, Siamak, Seyedi, Seyed Amjad, Ntoutsi, Eirini

arXiv.org Artificial Intelligence

Conventional fair graph clustering methods face two primary challenges: i) They prioritize balanced clusters at the expense of cluster cohesion by imposing rigid constraints, ii) Existing methods of both individual and group-level fairness in graph partitioning mostly rely on eigen decompositions and thus, generally lack interpretability. To address these issues, we propose iFairNMTF, an individual Fairness Nonnegative Matrix Tri-Factorization model with contrastive fairness regularization that achieves balanced and cohesive clusters. By introducing fairness regularization, our model allows for customizable accuracy-fairness trade-offs, thereby enhancing user autonomy without compromising the interpretability provided by nonnegative matrix tri-factorization. Experimental evaluations on real and synthetic datasets demonstrate the superior flexibility of iFairNMTF in achieving fairness and clustering performance.


From Data to Insights: A Comprehensive Survey on Advanced Applications in Thyroid Cancer Research

Zhang, Xinyu, Lee, Vincent CS, Liu, Feng

arXiv.org Artificial Intelligence

Thyroid cancer, the most prevalent endocrine cancer, has gained significant global attention due to its impact on public health. Extensive research efforts have been dedicated to leveraging artificial intelligence (AI) methods for the early detection of this disease, aiming to reduce its morbidity rates. However, a comprehensive understanding of the structured organization of research applications in this particular field remains elusive. To address this knowledge gap, we conducted a systematic review and developed a comprehensive taxonomy of machine learning-based applications in thyroid cancer pathogenesis, diagnosis, and prognosis. Our primary objective was to facilitate the research community's ability to stay abreast of technological advancements and potentially lead the emerging trends in this field. This survey presents a coherent literature review framework for interpreting the advanced techniques used in thyroid cancer research. A total of 758 related studies were identified and scrutinized. To the best of our knowledge, this is the first review that provides an in-depth analysis of the various aspects of AI applications employed in the context of thyroid cancer. Furthermore, we highlight key challenges encountered in this domain and propose future research opportunities for those interested in studying the latest trends or exploring less-investigated aspects of thyroid cancer research. By presenting this comprehensive review and taxonomy, we contribute to the existing knowledge in the field, while providing valuable insights for researchers, clinicians, and stakeholders in advancing the understanding and management of this disease.


Clustering of Urban Traffic Patterns by K-Means and Dynamic Time Warping: Case Study

Etemad, Sadegh, Mosayebi, Raziyeh, Khodavirdian, Tadeh Alexani, Dastan, Elahe, Telmadarreh, Amir Salari, Jafari, Mohammadreza, Rafiei, Sepehr

arXiv.org Artificial Intelligence

Clustering of urban traffic patterns is an essential task in many different areas of traffic management and planning. In this paper, two significant applications of clustering urban traffic patterns are described. The first application estimates missing speed values, using the speed of road segments with similar traffic patterns, in order to color map tiles. The second one is the estimation of essential road segments for generating addresses for a local point on the map, using the similarity patterns of different road segments. The traffic pattern of each road segment is extracted from its speed time series. In this paper, we propose a time series clustering algorithm based on K-Means and Dynamic Time Warping. The case study of our proposed algorithm is based on the Snapp application's driver speed time series data. The results of the two applications illustrate that the proposed method can extract similar urban traffic patterns.
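
To make the combination of K-Means and Dynamic Time Warping concrete, here is a minimal sketch of the classic DTW distance together with the K-Means assignment step that uses it. This is not the authors' implementation: the absolute-difference local cost, the function names, and the sample speed values are assumptions for illustration, and the centroid update step (which needs a DTW-aware averaging scheme) is left out.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences,
    using absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier b
                cost[i][j - 1],      # b[j-1] also matches an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def assign_to_nearest(series, centroids):
    """K-Means assignment step: index of the centroid closest under DTW."""
    return min(range(len(centroids)),
               key=lambda k: dtw_distance(series, centroids[k]))


# Hypothetical speed profiles (km/h) for illustration only.
centroids = [[50, 50, 50], [10, 10, 20, 30]]
print(assign_to_nearest([10, 20, 30], centroids))
```

Because DTW allows one sample to align with several samples of the other series, two speed profiles with the same shape but shifted or stretched in time can still have a small distance, which a pointwise Euclidean distance would not capture.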


CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation

Alam, Md Mahfuz Ibn, Ahmadi, Sina, Anastasopoulos, Antonios

arXiv.org Artificial Intelligence

Neural machine translation (NMT) systems exhibit limited robustness in handling source-side linguistic variations. Their performance tends to degrade when faced with even slight deviations in language usage, such as different domains or variations introduced by second-language speakers. It is intuitive to extend this observation to encompass dialectal variations as well, but the work allowing the community to evaluate MT systems on this dimension is limited. To alleviate this issue, we compile and release CODET, a contrastive dialectal benchmark encompassing 882 different variations from nine different languages. We also quantitatively demonstrate the challenges large MT models face in effectively translating dialectal variants. We are releasing all code and data.


Transfer Learning for Low-Resource Sentiment Analysis

Hameed, Razhan, Ahmadi, Sina, Daneshfar, Fatemeh

arXiv.org Artificial Intelligence

Sentiment analysis is the process of identifying and extracting subjective information from text. Despite advances in applying cross-lingual approaches automatically, the implementation and evaluation of sentiment analysis systems require language-specific data to account for various sociocultural and linguistic peculiarities. In this paper, the collection and annotation of a dataset are described for sentiment analysis of Central Kurdish. We explore a few classical machine learning and neural network-based techniques for this task. Additionally, we employ a transfer learning approach that leverages pretrained models for data augmentation. We demonstrate that data augmentation achieves a high F$_1$ score and accuracy despite the difficulty of the task.
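
For readers unfamiliar with the two metrics named in the abstract, the sketch below computes accuracy and the F$_1$ score (the harmonic mean of precision and recall) for a single positive class. It assumes the binary case for simplicity; the paper may use a multi-class or averaged variant, and the labels shown are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and predictions, for illustration only.
gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "pos", "neg"]
print(accuracy(gold, pred), f1_score(gold, pred, positive="pos"))
```

Reporting F$_1$ alongside accuracy matters when sentiment classes are imbalanced, since a classifier that always predicts the majority class can reach high accuracy while its F$_1$ on the minority class is zero.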


Universal Feature Selection Tool (UniFeat): An Open-Source Tool for Dimensionality Reduction

Tabakhi, Sina, Moradi, Parham

arXiv.org Artificial Intelligence

The Universal Feature Selection Tool (UniFeat) is an open-source tool developed entirely in Java for performing feature selection processes in various research areas. It provides a set of well-known and advanced feature selection methods along with significant auxiliary tools. This allows users to compare the performance of feature selection methods. Moreover, due to the open-source nature of UniFeat, researchers can use and modify it in their research, which facilitates the rapid development of new feature selection algorithms.